A Comparative Study of Data Mining Techniques in Predicting Consumers Credit Card Risk in Banks

Authors

  • Ling Kock Sheng
  • Teh Ying Wah
Abstract

It is increasingly important for banks to analyze and understand their risk portfolios, particularly those related to credit cards, since the credit card is one of the instruments that provides the highest return and, potentially, the greatest risk. Competition in the credit card business has increased significantly in the last few years as banks have pushed to bring out many new differentiated services and products to win market share. Banks also seek to improve their current portfolios by extending the application of their services, for instance by extending credit limits where this is deemed credible and appropriate; by improving performance, for example by predicting the future payment behavior of consumers; and, lastly, by properly managing bad debts. These efforts are usually managed through a single score, commonly known as a credit score. A credit score is usually derived from three major sources of information: historical credit information from the credit bureau or central bank, current and historical transactions extracted from the bank's database, and the collateral and guarantees provided. A measure of which consumers will default on their payments (bad credit) and which will not can therefore provide a good yardstick for substantiating the score assigned for creditworthiness. The focus of banks should therefore be on finding, among the models generated, the classifier that provides the best predictive accuracy. A substantial body of literature advocates the use of data mining techniques such as Logistic Regression, C5 and Neural Network for this purpose. Prior to generating the models for comparison, the initial data set is loaded into an ETL system customized to perform feature selection, or attribute relevancy analysis, using the ID3 algorithm, compiling a subset of data with the highest information gain and gain ratio.
An extended test applies equal-length binning to some attributes to determine whether it affects the relevancy of each attribute. The selected subset of data is used to generate various data mining models using different training and testing sizes. C5 consistently emerged as the technique that generated the best models, with an average predictive accuracy of 91.64%.
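
The attribute relevancy step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' ETL system. It only shows how information gain and gain ratio are commonly computed for a single attribute, and how a numeric attribute might be discretized first, assuming that "equal-length binning" means equal-width binning. The attribute names and the toy records are hypothetical and are not taken from the paper's data.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(items).values())


def info_gain_and_ratio(values, labels):
    """Return (information gain, gain ratio) of one discrete attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the class label after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy(values)
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio


def equal_width_bins(values, n_bins):
    """Map numeric values to bin indices 0..n_bins-1 of equal width."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical toy records (not from the paper): one row per consumer.
defaulted = ["yes", "yes", "no", "no", "no", "yes", "no", "no"]
has_collateral = ["no", "no", "yes", "yes", "yes", "no", "yes", "no"]
credit_limit = [1000, 1500, 8000, 9000, 7000, 2000, 6000, 5000]

limit_bins = equal_width_bins(credit_limit, 2)

# Rank attributes by information gain, highest first.
ranking = sorted(
    [
        ("has_collateral", *info_gain_and_ratio(has_collateral, defaulted)),
        ("credit_limit_binned", *info_gain_and_ratio(limit_bins, defaulted)),
    ],
    key=lambda row: row[1],
    reverse=True,
)

for name, gain, ratio in ranking:
    print(f"{name}: gain={gain:.4f}, gain_ratio={ratio:.4f}")
```

In this toy data the binned credit limit separates defaulters perfectly, so it ranks above the collateral flag. Changing the number of bins changes both the gain and the split information, which is the kind of effect the extended binning test in the abstract examines.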

Similar resources

Combination of Ensemble Data Mining Methods for Detecting Credit Card Fraud Transactions

As we know, credit cards speed up and simplify life for all citizens and bank customers. They can use them anytime and anywhere according to their personal needs, instantly and without hassle, without worrying about carrying a lot of cash, and with more security than holding cash. Together, these factors make credit cards one of the most popular forms of online banking. This has led ...

Credit scoring in banks and financial institutions via data mining techniques: A literature review

This paper presents a comprehensive review of the work done during 2000–2012 on the application of data mining techniques in credit scoring. Yet there isn’t any literature in the field of data mining applications in credit scoring. Using a novel research approach, this paper investigates academic and systematic literature review and includes all of the journals in the Science direct onli...

Predicting the Credit Risk of Loans Using Data Mining Tools

One of the most common factors taken into account for credit risk is the customer’s noncompliance with commitments. Thus, by predicting the behavior of loan applicants, the growth rate of debts can be decreased. Hence, this study is conducted on corporate applicants for loans in one of the public banks in Iran. In this paper, the main elements comprising the cus...

Detecting Suspicious Card Transactions in Unlabeled Bank Data Using Outlier Detection Techniques

With the advancement of technology, the use of ATM and credit cards has increased. Cyber fraud and theft are threats that result from using these technologies. It is therefore essential to use fraud detection algorithms to prevent fraudulent use of bank cards. Credit card fraud can be thought of as a form of identity theft that consists of an unauthorized access to another person's ...

Credit Card Fraud Detection using Data mining and Statistical Methods

Due to today’s advances in technology and business, fraud detection has become a critical component of financial transactions. Given the vast amounts of data in large datasets, it is increasingly difficult to detect fraudulent transactions manually. In this research, we propose a combined method using both data mining and statistical tasks, utilizing feature selection, resampling and cost-...

Publication date: 2010